Genome-wide methylome stability and parental effects in the worldwide distributed Lombardy poplar

Background Despite the increasing number of epigenomic studies in plants, little is known about the forces that shape the methylome in long-lived woody perennials. The Lombardy poplar offers an ideal opportunity to investigate the impact of the individual environmental history of trees on the methylome. Results We present the results of three interconnected experiments on Lombardy poplar. In the first experiment, we investigated methylome variability during a growing season and across vegetatively reproduced generations. We found that ramets collected over Europe and raised in common conditions have stable methylomes in symmetrical CG-contexts. In contrast, seasonal dynamics occurred in methylation patterns in CHH context. In the second experiment, we investigated whether methylome patterns of plants grown in a non-parental environment correlate with the parental climate. We did not observe a biological relevant pattern that significantly correlates with the parental climate. Finally, we investigated whether the parental environment has persistent carry-over effects on the vegetative offspring’s phenotype. We combined new bud set observations of three consecutive growing seasons with former published bud set data. Using a linear mixed effects analysis, we found a statistically significant but weak short-term, parental carry-over effect on the timing of bud set. However, this effect was negligible compared to the direct effects of the offspring environment. Conclusions Genome-wide cytosine methylation patterns in symmetrical CG-context are stable in Lombardy poplar and appear to be mainly the result of random processes. In this widespread poplar clone, methylation patterns in CG-context can be used as biomarkers to infer a common ancestor and thus to investigate the recent environmental history of a specific Lombardy poplar. The Lombardy poplar shows high phenotypic plasticity in a novel environment which enabled this clonal tree to adapt and survive all over the temperate regions of the world. Supplementary Information The online version contains supplementary material available at 10.1186/s12915-024-01816-1.


Background
The contribution of epigenetic variation to adaptation and evolution in plants has been a point of debate during the past decades [1,2].In contrast to animals, where epigenetic reprogramming occurs in germ cells and early embryos, epigenetic modifications in somatic plant cells are generally stable [3].As flowers are induced from somatic cells, it has been hypothesized that epigenetic modifications can induce an ecological memory that is maintained over many generations [4].Among several epigenetic mechanisms, DNA methylation is the most studied process.It regulates gene expression and silencing of transposable elements.Variation in DNA methylation can thus affect the plant phenotype by modifying gene expression [5].DNA-methylation patterns can be stably transmitted over two or more plant generations [6].However, it is unclear to which degree inherited epigenetic variation is randomly induced or due to environmental conditions experienced during the parental generation [7].Particularly for long-lived woody perennials, information on inherited genome-wide methylation patterns is scarce [8].For example, little is known on how genome-wide DNA methylation patterns (randomly induced or not) vary in woody perennials sharing the same habitat and/or population history.Also, there is a need for better understanding the impact of transmitted epigenetic patterns on the plant phenotype.Neither the magnitude nor the mechanisms behind parental effects are fully understood.
Inherited epigenetic patterns have also been proposed as an explanation for the "missing heritability" [9].Genome-wide association studies (GWAS) of complex plant traits, such as bud set in trees, consistently found that genetic factors only explained a minority of the expected heritable fraction [10].This unidentified heritable fraction is called the "missing heritability" [9].Theoretical frameworks have been developed that quantify the effect of "missing heritability" on complex traits [5,11] but little empirical information currently exists [1,5].For empirically investigating the fitness consequences of epigenetic inheritance, it is important to clearly distinguish between genetic, epigenetic, and other sources of transgenerational variation [1].Genetic replicates (i.e., clonal copies) provide ideal systems to investigate the potential role of epigenetic inheritance in adaptive evolution [1].
The Lombardy poplar (Populus nigra cv."Italica" Duroi) is an excellent study system to investigate how long-lived woody perennials with a prevailing vegetative reproduction can cope with widely contrasting environmental conditions, without variation at the genetic level [12].This clonal variety of P. nigra L. originates from a single mutant male tree and has been distributed worldwide by humans since the beginning of the eighteenth century [13][14][15].The Lombardy poplar is propagated by cuttings from plant material that has been grown locally, sometimes for centuries.It can thus be expected that the largescale vegetative reproduction of this cultivar in space and time may have resulted in the accumulation of lineagespecific epimutations.Epimutations are heritable gains or losses of cytosine methylation that arise independently of DNA sequence changes [16].Cytosine methylation is sometimes gained or lost in a stochastic fashion, resulting in so-called "spontaneous epimutations." Some epimutations might be environmentally-driven and thus non-neutral.Epigenetic changes could potentially affect the plant phenotype and, when transmitted to vegetative offspring, result in transgenerational phenotypic plasticity.Indeed, in the 1-year-old vegetative offspring of the clonal Lombardy poplar and thus in absence of genetic variation, a former study found significant variation in the timing of bud set in a common environment experiment [12].Ramets (i.e., vegetative offspring of a single parent or ortet) collected from Lombardy poplar ortets from colder origins set bud slightly quicker than ramets obtained from ortets from warmer origins [12].Whether this transgenerational phenotypic effect remains stable over several years is not known.Based on a relatively small number of methylation-sensitive amplified fragment length polymorphisms (MS-AFLP), the latter study could not directly link epigenetic variation with the climate at the home site of the parent-of-origin or with the observed variation in bud set phenology [12].In different (non-clonal) study species however, and using more powerful whole-genome bisulfite sequencing (WGBS) methodologies, associations were found at the singlenucleotide methylation level with spatial structure and climate variables in the long-lived Quercus lobata [17,18], in the annual herb Arabidopsis thaliana [19][20][21], in Fragaria vesca [22] and even in the Lombardy poplar [preprints; [23,24].Such observational studies associate epigenetic variation with the environment but they cannot determine whether epigenetic variation is environmentally induced or spontaneous.
Here, we present the results of three interconnected studies on Lombardy poplar; two observational experiments on the methylome and one study on phenotypic plasticity in a common environment experiment.In the first methylome experiment, we investigated methylome dynamics during early plant life.We studied the variability of the methylome during a growing season and across vegetatively reproduced generations.In the second methylome experiment, we investigated whether methylome patterns of plants grown in a non-parental environment correlate with the parental climate.The applied approach to identify differentially methylated regions (DMRs) among plants grouped per parental origin was twofold: (i) within predefined regions and (ii) through a de novo DMR-calling method that covered whole methylomes (i.e., genic and intergenic regions).We adopted the Bioconductor packages edgeR [25] and dmrseq [26] originally designed for mammalian epigenomes to the more intricate context of the plant epigenome.This approach allowed to account for biological variability between sampled groups while controlling FDR at the region level and maximizing the power to detect DMRs.Finally, in the third experiment, we investigated whether the parental environment has persistent carry-over effects on the vegetative offspring's phenotype.We observed timing of bud set in a common environment.We combined new bud set observations of three consecutive growing seasons with former published bud set data [12].Using a linear mixed effects analysis on the total bud set data spanning 4 years, we investigated whether the observed transgenerational phenotypically plastic effect in terms of timing of bud set persisted.
We found considerable variation between the methylomes of Lombardy poplar ramets collected on ortets growing over Europe and grown in a common environment.DNA methylation in symmetrical CG-context was stable across the growing season and stably inherited by vegetative offspring.Remarkably, these stable, likely neutral methylation patterns can be used as a biomarker for tracing a recent common ancestor of genetically identical ramets.In contrast, seasonal methylation dynamics occurred in CHH context.We observed a weak, shortterm parental carry-over effect on the timing of bud set.However, we did not detect correlations between variation in DNA methylation and the parental climate.This study provides new insights in parental effects and in methylome stability within and across growing seasons and across asexual generations in a long-lived, woody perennial.The results contribute to the understanding of transgenerational plasticity and provide further insight into the biological significance of epigenetic diversity in plants.

Experimental set-up
The Lombardy poplar represents one single, male genotype that can be recognized by its narrow, columnar growth form (Fig. 1A).However, since the beginning of the twentieth century, the Lombardy poplar was used in poplar breeding programs with controlled crosses resulting in several other fastigiate cultivars of Populus nigra, e.g., P. nigra cv."Thevestina" Dode [15].We therefore selected plant material for this study that was described and verified as the "true" Lombardy poplar previously [12,27] thus representing one single genotype.
We performed three studies on Lombardy poplar; two observational studies on the methylome and one phenotyping study in a common environment experiment.In the first methylome study (Fig. 2A), we investigated methylome dynamics during early plant life.We investigated methylomes of 16 individual plants (Table 1) using WGBS.The plants were grown in a common greenhouse environment in Geraardsbergen, Belgium (latitude 50,77,635°, longitude 3.881007°).We used DNA samples of individual plants to investigate the stability of the epigenome during a growing season and across vegetatively reproduced generations.The DNA samples from individual plants included (i) DNA samples from first-generation vegetative offspring (F 1 -ramets) collected from Lombardy poplars located in Hungary, Italy, Spain, and the UK, and (ii) DNA samples from second-generation vegetative offspring (F 2 -ramets), propagated from the former F 1 -ramets.We extracted DNA from one single leaf of the F 1 -ramets collected in 2017 at the beginning (March 31) and near the end (July 25) of the growing season.Vegetative propagation of four F 1 -ramets resulted in one to three F 2 -ramets per propagated F 1 -ramet.Of these, we collected one leaf on June 12, 2018, for DNA extraction and WGBS.
In the second methylome study (Fig. 2B), we investigated whether methylome variation within plants grown in a non-parental environment correlates with the parental climate.We performed WGBS on 13 DNA pools.We used DNA pools to investigate whether epigenetic variation within vegetative offspring in a non-parental environment correlates with parental climatic conditions.We prepared 13 pools of DNA by mixing the same amount of DNA from 2 to 9 (mean 3.6) individuals.The pools were composed of F 1 -ramets grown in a common environment in Geraardsbergen (Belgium) and originally collected on Lombardy poplar ortets growing in proximity (< 40 km).These F 1 -ramets were collected on in total 53 Lombardy poplar ortets growing across Europe.We used the average January temperature at the geographic location of the ortet as a proxy for the parental climate.We sequenced pools to reduce inter-individual methylation variation and fluctuations in methylation rate caused by stochastic coverage or read sampling bias [2].The locations of the sampled adult Lombardy poplars (i.e., ortets) are presented in Fig. 1B.Additional file 1 includes detailed information on the Lombardy poplars sampled for WGBS.
Finally, in the third experiment, we investigated whether the parental environment has persistent carryover effects on the vegetative offspring's phenotype.We observed timing of bud set in a common environment (Fig. 2C).We combined new bud set observations of three consecutive growing seasons with former published bud set data [12].Using a linear mixed effects analysis on the total bud set data spanning 4 years, we investigated whether the observed transgenerational phenotypically plastic effect in terms of timing of bud set persisted over several years.

Epigenetic variation and methylome stability
On average 46% (range 44-49%) of the sequence reads obtained by WGBS from the 29 DNA samples were mapped uniquely to the genome of Populus trichocarpa.(See figure on next page.)Fig. 2. A Experimental set-up of the first methylome study.We sampled 16 plants of Lombardy poplar grown in a common environment in Belgium.These plants included first-generation vegetative offspring (F 1 ) grown from ramets collected from four adult Lombardy poplars (F 0 ) located in Hungary, Italy, Spain, and the UK, respectively, and second-generation vegetative offspring (F 2 ).We took leaf samples in 2017 on March 31 and July 25, and in 2018 on June 12.We extracted DNA from one leaf per plant and performed whole-genome bisulfite sequencing (WGBS).We used clustering analysis to study the methylome stability across a growing season and across vegetative generations.We studied differentially methylated regions (DMRs) by grouping the WGBS data by the F 0 -parent.B Experimental set-up of the second methylome study.We sampled 53 plants of Lombardy poplar grown in a common environment in Belgium.These plants included first-generation vegetative offspring (F 1 ) grown from ramets collected from 53 adult Lombardy poplars (F 0 ) located over Europe.We took leaf samples in March 2017 and we extracted DNA from one leaf per plant.We prepared 13 DNA pools from 2 to 9 F 1 -plants originating from F 0 -Lombardy poplars growing in proximity.We performed WGBS and searched for DMRs by grouping the WGBS data based on the climate of the F 0 -parent, using the average January temperature as a proxy for the parental climate, resulting in three groups.C Experimental set-up of the common environment experiment.We observed timing of bud set on 791 plants of Lombardy poplar grown in a common environment in Belgium using a seven-stage scoring.These plants included F 1 -ramets collected from 67 adult Lombardy poplars (F 0 ) grown on 28 different locations in 13 different countries over Europe.We used a linear mixed effects analysis on the total bud set data spanning 4 years (2017-2020) to investigate the relationship between bud set phenology of the ramets in the common environment experiment (F 1 ) and the latitude of origin of the home site of the Lombardy poplar parent-of-origin (F 0 ).Icons made by Freepik from www. flati con.com The bisulfite conversion rate exceeded 99% in every sample.The average percentage of total cytosines that were methylated in the 16 individual Lombardy poplar ramets was 22, 11 and 2% in the CpG, CHG, and CHH context, respectively (where H represents A, C or T).Additional file 2 includes a table summarizing the mapping statistics, percentages of cytosine methylation and percentages of bisulfite conversion rates for each of the DNA samples analyzed by WGBS.The dendrogram based on the differentially methylated cytosines in CpG context as well as in CHG context from the individual ramets clearly grouped the F 1 -and the F 2 -ramets according to their parent-of-origin (F 0 ) (i.e., ortet) (Fig. 3A).Similar clustering results were obtained based on the multidimensional scaling analyses (MDS) (Fig. 3B).Methylation in symmetrical CG-context was stable within and across seasons.In contrast, the hierarchical cluster analysis clustered the samples based on differentially methylated cytosines in CHH context according to the sampling period in the growing season; samples taken at the beginning of the growing season (March) clustered clearly separately from the samples taken in the middle of the growing season (June and July).However, the latter pattern was less clear on the scatter plot of the first two dimensions obtained by the MDS.
To explore the functionally more relevant differentially methylated regions (DMRs) compared to DMPs [28], we identified DMRs in predefined regions (gene bodies and  B Multidimensional scaling (MDS).Sample names are explained in Table 1.Different colors represent the four different ortets.Green: "ITS3, " purple: UKD2"; red: "HUN4" and blue: "SPC1″ located in Italy, UK, Hungary and Spain, respectively promoters).In total, we identified in total 2881 statistically significant DMRs (p-value < 0.05) within the predefined genomic gene bodies and promoter regions, for the three sequence contexts (CpG, CHG, and CHH-DMRs) between any of the six pairwise comparisons of the individual ramets grouped per ortet.The majority of these DMRs were found in CpG-and CHG-context (53 and 45%, respectively) and within genes (84.4%).Most of the DMRs were common to two or more pairwise comparisons (63 and 74% for promoter and gene regions, respectively) (Fig. 4A).A list of all DMRs identified in predefined regions is presented in supporting file 1 at Zenodo (63).
The overrepresentation of the (GO) terms and their biological processes associated with genes located in these DMRs are presented in Fig. 4B.GO terms that were enriched in CpG-DMRs located in genes were mainly involved in metabolic and cellular processes (e.g., catalytic activity, receptor binding), whereas the few (0.2%) GO terms in CH-DMRs located in genes were enriched for biological processes involved in regulatory functions (e.g., regulation in gene expression).One gene was differentiated between all pairwise comparisons in CpG context, a gene with a transferase activity, UBIA PRE-NYLTRANSFERASE DOMAIN-CONTAINING PRO-TEIN 1 (PTHR13929:SF0) involved in quinone metabolic process.GO terms that were enriched in CpG-DMRs located in promoters were mainly involved in metabolic and cellular processes (e.g., catalytic activity, receptor binding), whereas few (2%) were involved in biological regulation.Additional file 3 includes the total list of GO terms that were enriched in DMRs.The clustering results and the heatmaps of enriched GO terms for all pairwise comparisons are presented in supporting file 2 at Zenodo (63).Heatmaps of GO terms that were over-represented in CpG-DMRs located in promoters per pairwise comparison of ramets grouped per parent-of-origin are presented in Additional file 4. No specific DMRs were found associated with rhythmic process (GO:0048511) any process pertinent to the generation and maintenance of rhythms in the physiology of an organism (thus including bud set phenology).There are 17 genes classified within this biological process in Populus trichocarpa, none of these occurred in the list of DMRs.
In contrast to the WGBS data of the 16 individual ramets, we did not observe any biological relevant pattern in the clustering of the WGBS data of the 13 pooled DNA samples based on the DMPs.Also, no significant DMRs (neither de novo, nor in predefined regions) were detected between pairwise comparisons of the pooled samples grouped based on the average January temperature at the geographic location of the Lombardy poplar parent-of-origin.Moreover, we found no significant de novo DMRs when comparing, for each sampling period (March vs. June/July), the methylomes of the individual ramets grouped based on the average January temperature at the geographic location of the Lombardy poplar parent-of-origin (i.e., ramets originated from the ortets from Hungary and UK versus ramets from the ortets located in Spain and Italy).

Variation in timing of bud set
Whether transgenerational phenotypic transmission depends on parent-of-origin is important to understand the mechanisms behind phenotypic plasticity [29,30].We grew 791 ramets from 67 adult Lombardy poplar ortets planted across Europe in 28 locations.The ramets were grown in a non-maternal, uniform environment.We analyzed bud set data collected during four consecutive growing seasons.We found a statistical significant but weak short-term, parental carry-over effect of the location of the Lombardy poplar parent-of-origin (in terms of latitude) on the timing of bud set of the ramets in the novel environment.However, this effect was negligible compared to direct effects of the asexual offspring's environment.There was a statistically significant effect of the latitude of origin (Lat) on bud set phenology but the estimated effect was very small and only of biological significance in the first year of the experiment (2017) (Fig. 5A).In 2017, moving ramets northward from the latitude of the parent-of-origin, thus into longer days, resulted in a slightly increase in bud set score and thus a slightly longer growth period for these ramets relative to ramets originating from the latitude nearby the experimental site.Although still statistically significant, the effect size of the latitude of origin (Lat) is near zero in the years 2018, 2019, and 2020 and of non-biological relevance (Fig. 5B).In the first year of the experiment (2017), the effect on bud set phenology of the cumulative daily minimum temperature during plant growth at the experimental site (Tsmin) was about two orders of magnitude larger compared to the effect of the latitude of origin (Lat) and thus clearly of higher importance.The effects sizes of the variables affecting the bud set phenology and maintained in the linear mixed effects model are given in Table 2.The response variable was bud set score and the year 2017 was set as the reference year.The values of the estimates given in Table 2 are thus the effect sizes on the response variable compared to the year 2017.The intercept represents the estimate of the response variable for the year 2017, with the values for the variables latitude (Lat) and Tsminc, set to zero.Only two-way interactions were considered.Additional file 5 contains the raw data of the bud set observations and Additional file 6 includes the source codes to reproduce the results of the bud set data analysis.

Discussion
The methylome as a biomarker for origin The widespread, clonally propagated Lombardy poplar provides a unique opportunity to study epimutation accumulation over long timescales and in variable environments [12].This study found several tens of thousands cytosine methylation polymorphisms (DMPs) between genetically identical ramets grown in a common environment and collected on adult trees (ortets) from diverse sites over Europe.Strikingly, ramets grown in the controlled environment for up to two growing seasons continued to retain genome-wide epigenetic marks in symmetrical CG-contexts (i.e., CpG and CHG, where H = A, C, T) typical of the founder tissue used for the initial propagation.The hierarchical cluster analysis grouped individual ramets' grown in a novel environment by the parent-of-origin, regardless of the ramets' sampling date during the growing season or regardless of their asexual reproduction cycle (F 1, F 2 ).This suggests that genome-wide epigenetic variation in woody perennials in symmetrical CGcontexts is stable and likely mostly random.In contrast, seasonal changes occurred in CHH-methylation.Hierarchical clustering analysis of the individual samples based on DMPs in the asymmetrical CHH context grouped the samples according to the date of sampling during the growing season (March versus June/July), and thus not by the parent-of-origin.The genomewide methylation patterns and stability in the different sequence contexts in the woody perennial, the Lombardy poplar, are in line with these found for herbaceous short-lived plant species.Methylome stability at individual CpG-sites [2] and temperature-dependence variation in methylation at CHH-sites was also found in Arabidopsis species [19,28].Each of the three contexts of methylation is distinctly regulated and maintained by unique pathways [reviewed by [31,32].In plants, methylation occurs mostly in transposable elements and other repetitive DNA sequences (in CpG, CHG, and CHH context) while methylated CG is the most abundant form of cytosine methylation and is often found within transcribed regions (i.e., gene body methylation) [6].Our findings suggest that in woody plants, geographically structured natural variation in epigenetic patterns in CHH context may be the result of the variation in seasonality along a latitudinal gradient (i.e., variation in photoperiod and temperature).
Thus, these genome-wide methylation patterns in CHH context reflect phenotypic adjustments to local conditions or intra-individual phenotypic plasticity (i.e., the ability of one genotype to express varying phenotypes when exposed to different environmental conditions [29]) rather than evolutionary adaptation.
Epigenetic mechanisms might be similar in plant species with different life history traits, but their long-term stability and their potential relevance for adaptation in plant populations might be very different for short-lived versus long-lived plant species.There are large differences in life history traits between herbaceous annuals and woody perennial plants, such as timing of reproduction, age and size at maturity, and growth pattern.These traits influence the per-year mutation rate at the genomic, epigenomic, and transcriptional level [8].The rates of epimutations in poplars are four to five orders of magnitude greater than those of spontaneous somatic mutations.By sampling multiple, age-estimated branches of a single tree of Populus trichocarpa, Hofmeister et al. [8] found that symmetrical CG-methylations are cumulative across somatic development and originate mainly from DNA methylation maintenance errors during mitotic cell divisions.In contrast to the strong similarities in pergeneration (epi)genetic mutation rates, the per-year (epi) genetic mutation rates are lower in the perennial poplar than in the in the herbaceous, annual plant Arabidopsis thaliana [8].The frequent vegetative reproduction and widespread use of the Lombardy poplar by humans may have facilitated epimutation accumulation during mitotic cell divisions.Once established, epimutations can stably persist or be inherited across generations [33].In a study of white poplar (Populus alba), ramets collected on ortets growing at different locations and representing a single clone grouped according to the sampling site based on differentially methylated CG sites generated by Methylation Sensitive Amplified Fragment Length Polymorphisms (MS-AFLPs) [34].This is in agreement with the finding of this study that closely related plants sharing a common ancestor show high similarities in genome-wide symmetrical CG-methylation patterns.Epigenetic patterns in symmetrical CG-contexts can thus be used as a biomarker for identifying a common ancestor or ortet of genetic identical ramets.This may be useful, for example, for providing insights into the range expansion of invasive, clonal plants, like Japanese knotweed.To our knowledge, this is one of the first multigenerational studies that contribute to the recent evidence that spontaneous symmetrical CG-epimutations in long-lived perennials accumulate across asexual generations and at a high rate over time [8] resulting in rapid methylome divergence among asexual plant lineages.These CG-epimutations can be used as a fast-ticking molecular clock, not only for age-dating perennials [8] but also for timing evolutionary events in the recent past [35][36][37] like the construction of clonal phylogenies.Geographically structured epigenetic variation is found in a wide range of organisms [17,20,38].Several observational studies suggest that environment induces epigenetic variation and that selection and local fitness effects have led to the observed epigenetic patterns [17,22,39], but evidence for causal relationships is lacking.Most epimutations are spontaneous and likely neutral [8,16,18,33] though epimutations can result in phenotypic variation in plants [40][41][42].In this study, no significant differences in methylated regions (DMRs) were detected between pairwise comparisons of the methylomes of pooled samples grouped by average temperature in January at the geographic location of the adult Lombardy poplars.By pairwise comparisons of individual methylomes grouped per parent-of-origin, we found significant DMRs within gene bodies and enriched for cellular and metabolic processes.To date, there is no convincing data to support a direct causal link between the presence/ absence of gene body methylation and the modulation of expression in flowering plants [8,33].We also found DMRs within promoters that were enriched for biological processes (e.g., biological regulation, response to stimulus) and that could result in transcriptional consequences effecting the plants' phenotype.In an empirical study testing the effects of drought on transcriptomes in three poplar clones, Raj et al. [43] suggest that clone history can have a profound effect on drought responses.In this study, we cannot make conclusions about environmental effects on the epigenome.In trees, local adaptation along climatic gradients is mostly the result of selection on polygenic traits, involving hundreds or thousands of loci, each with a very small effect, that is very difficult to detect [44].The complex interplay between genotype and environment combined with the limited number of biological replicates, resulted in limited power of this study to identify causal epigenetic variants.However, Zhou et al. [45] detected a functional relationship between DNA-methylation diversity and drought resistance and resilience in Populus tomentosa by combining epigenome-wide association studies with genome-wide association studies.Further research is needed building on our findings to help understand the mechanisms by which DNA methylation epialleles are generated, maintained and erased, and their functional consequences in plants.

Weak, short-term parental effect
Several studies have highlighted the potential role of environmentally induced epigenetic variation on parental effects [46], although empirical evidence is scarce.Here, we experimentally tested the potential effect of epigenetic inheritance on phenotypic variation in terms of bud set in a common garden experiment.We found a significant but very weak and short-term parental carry-over effect on the timing of bud set in the novel environment of vegetative offspring for the Lombardy poplar.The direction of this effect correlated with the latitude of origin of the Lombardy poplars; ramets of Lombardy poplars from more southern locations set bud slightly later in summer than trees from more northern locations [12].Indeed, the movement of a locally adapted poplar genotype from the latitude of origin towards more northern locations, thus into longer days during the growing season, prolongs the period of its active growth while the contrary is observed for movements towards more southern locations [47,48].However, the temperature at the new environment had a much stronger influence on the duration of bud formation compared to the parental effect.Furthermore, the weak parental effect on bud set was reduced to near zero in the three consecutive growing seasons and had little biological relevance after the first growing season.This indicates that the parental effect is very limited compared with the direct effect of the offspring environment and that it becomes negligible after the first growing season.It may therefore rather be an intergenerational effect than a truly transgenerational effect [49].This short-time epigenetic memory observed in this study is therefore unlikely to be responsible for a contribution to heritable variation, the so-called "missing heritability" or the contribution to phenotypic variation in bud set observed in experimental studies that could not be attributed to the genetic architecture [50,51].This is in line with the results of a phenological preprint study, including bud set observations, of the Lombardy poplar by Rodriguez et al. [preprint [23].It also confirms the conclusion of Uller et al. [52] based on a meta-analysis of experimental studies on anticipatory parental effects in plant and animals indicating that parental effects are generally small compared to direct effects of offspring environment, and more subtle than typically assumed.The results of this study show a very high phenotypic plasticity of Lombardy poplar showing a rapid response mechanism in bud phenology that likely enabled this clonal tree to adapt and survive after human introduction all over the temperate regions of the world and even in subtropical environments.

Conclusions
The clone history of Lombardy poplar shapes the genome-wide methylome which can be used as biomarkers to identify a common ancestor of genetically identical trees on short evolutionary timescales.Our results show a rapid plastic response of the Lombardy poplar and a weak short-time memory effect on the timing of bud set of the parental environmental cues.

Whole-genome bisulfite sequencing DNA extraction
Total genomic DNA was extracted from fresh leaves with the Qiagen Plant DNA kit (Hilden, Germany).The integrity of the DNA was assessed on 1.5% agarose gels, and DNA quantification was performed with Quant-iT ™ PicoGreen ® dsDNA Assay Kit (Thermo Fisher Scientific, Massachusetts, USA) using a Synergy HT plate reader (BioTek, Vermont, USA).

Library construction and whole-genome bisulfite sequencing
Up to 250 ng of DNA was fragmented to 300 bp with a Covaris S2 sonicator prior to whole-genome library preparation (NEBNExt Ultra II kit), ligation of methylated adaptors and size selection on 2% E-gels (400-550 bp).Bisulfite conversion was performed using the EZ DNA methylation gold kit (Zymo Research), followed by an enrichment PCR of 12 cycli.Paired-end 100-bp sequencing of the library fragments was performed on an Illumina Novaseq S2 sequencer, generating on average 83,886,466 ± 14,181,156 reads (mean ± SD) and an average read depth of approximately 17x.To determine the bisulfite conversion rate, we aligned the reads treated with bisulfite to the unmethylated chloroplast DNA of P. nigra (Populus nigra isolate LiuJQ-XJ-2011-047 chloroplast, complete genome-Nucleotide-NCBI (nih.gov)).The raw fastq datafiles, the processed data, and the metadata are available at the Gene Expression Omnibus (GEO) database (submission GSE225596) [53].

Identification of methylated sites
The raw fastq-files were first quality-based trimmed and adapter trimmed with TrimGalore v0.6.6 (https:// doi.org/https:// doi.org/ 10. 5281/ zenodo.51278 99).The trimmed reads were subsequently mapped onto the genome of Populus trichocarpa (Pop_tri_v3, Ensembl release 48) using Bismark v0.23.0 [54].We opted for the reference genome Pop_tri_v3 over the genome assembly of Populus nigra cv."Italica" Duroi based on criteria such as completeness, assembly quality, and annotation.As this is the genome of a related species, the mapping parameters were relaxed by specifying the function -score_min L,0,-0.4.Afterwards, the aligned BAM files were deduplicated using Bismark.Lastly, bismark_meth-ylation_extractor and bismark2bedGraph v0.24.0 were used to generate the context-specific methylation states (i.e., CpG, CHG and CHH) at the individual cytosine level for each sample.Only loci that were covered in all samples were retained for further analysis.Additional file 7 represents the different steps of the bioinformatics pipeline (i.e., snakefile).Because it is unclear whether differentially methylated positions have any functional consequences in plants [28], we identified methylated regions.

Identification of differential methylated regions in predefined regions
Plant DNA methylation occurs through the link of a methyl group (-CH3) at cytosine in symmetric context; CpG and CHG, and asymmetric sequence context; CHH (where H is any nucleotide except G).We first summarized the identified methylated sites into known regions; promotors and gene bodies.Thereafter, we tested these predefined regions for differential methylation.The value of this approach has been demonstrated before (see [2]).We used R version 4.1.1[55] for all R packages.We applied the Bioconductor package edgeR [26].The latter package was initially designed for RNA-seq data, subsequently customized for analyzing reduced representation bisulfite sequencing (RRBS) data from mammalian genomes.Here, we extended its use to the more complex plant methylomes.In mammals, the vast majority of methylated cytosines occur in CpG context, while for plants methylation occurs in all three sequence contexts (CpG, CHG, CHH).Identified methylated sites were first assigned to relevant predefined regions with the R package GenomicRanges and methylated regions were identified with the package edgeR v3.40.2.We focus on gene bodies and promoters as predefined regions to limit the number of pairwise comparisons in the next steps.We defined gene body methylation (gbM) as enrichment of CpG-only methylation within gene bodies with a depletion of methylation at transcription start site (TSS) and transcription termination sites (TTS) [6].Promoters were defined as the region between 2000 bp upstream and 200 bp downstream the TSS.Unlike most other approaches to methylation sequencing data, the edgeR workflow keeps the counts for methylated and unmethylated reads as separate observations.Generalized linear models are used to fit the total read count (methylated plus unmethylated) at each genomic locus, in such a way that the proportion of methylated reads at each locus is modeled indirectly as an over-dispersed binomial-like distribution.Keeping methylated and unmethylated read counts as separate data observations allows the variability of the data to be modeled more directly and perhaps more realistically [26].We identified DMRs by grouping the WGBS data from the DNA samples of individual plants by their corresponding parent-of-origin (ortet located in Hungary, Italy, Spain, and the UK, respectively).This resulted in four groups with three (Hungary), (Italy), five (Spain), and four (UK) biological replicates per group.We identified statistically significant DMRs in between-group pairwise comparisons for two predefined regions (genes and promoters) for each of the three sequence contexts (CpG-, CHG, CHH-DMRs), resulting in 36 pairwise comparisons.The Benjamini-Hochberg multiple testing correction was applied to control the false discovery rate (FDR) at a nominal level of 5%.Additional file 8 includes the R-script with the code to reproduce these analyses.

Identification of de novo differentially methylated regions
In addition to identifying DMRs in promotors and gene bodies, we also scanned the methylomes to detect unbiased de novo DMRs.This was done using the Bioconductor package dmrseq v1.18.0 [25].A DMR was called differentially methylated when the adjusted p-value was smaller or equal than 0.05.Lastly, the position of the differentially methylated DMRs was compared to the current genomic annotation.We also applied this package developed for mammalian epigenomes to the more complex plant epigenome.In most former plant studies, computational approaches for detecting DMRs do not provide accurate statistical inference.Methods to identify DMRs in WGBS experiments are greatly hindered by the high-dimensionality and low sample size setting that is common in high-throughput genomics studies.The number of tests performed is equal to the number of loci analyzed, which is very large in typical WGBS studies.The approach implemented in dmrseq improves the specificity and sensitivity of lists of regions and accurately controls the false discovery rate (FDR) [25].Dmrseq uses a two-stage approach.First, candidate regions are detected and afterwards their statistical significance is evaluated.In brief, dmrseq starts by smoothing the signal, because neighboring cytosines are highly correlated.Subsequently, candidate regions are defined by segmenting the genome groups of cytosines that show consistent evidence of differential methylation.In the second stage, a statistic is computed for each candidate region.This statistic takes into account variability between biological replicates and spatial correlation among neighboring loci [25].To our knowledge, this is one of the first studies applying this approach on plant genomes.Only cytosines that have coverage in all samples were retained.
To investigate whether the epigenetic variation within vegetative offspring in a non-parental environment correlates with the parental climate, we used the (i) WGBS data of the 13 pooled DNA samples and (ii) the WGBS data of the individual ramets.For the pooled samples, we grouped the WGBS data in three groups based on the average temperature in January (JAN) at the geographic location of the Lombardy poplar parent-of-origin.The geographic location of the ortets included in the analysis of the pooled samples is given in Fig. 1B.We used JAN as the representative climate variable.Temperature is an important environmental factor determining plant vitality and growth [56].Cold winters effect tree dormancy and soil water availability and January is generally the coldest month of the year at locations of the sampled Lombardy poplars.The climate variable JAN is correlated with the average annual precipitation (correlation coefficient: 0.6; see Additional file 6 which is another important climate variable.We used data from WorldClim 2 [57] for the period 1965-2015 to characterize the home environment of each ortet.In addition to the 13 pooled samples, we included one individual sample (WGBS04, sample from Spain, Tree_ID SPC1) in this analysis.This resulted in three groups with five (group 1, range JAN (°C); 1.82-3.61),six (group 2, range JAN (°C); − 1.40-0.15),and three (group 3, range JAN (°C); 4.97-8.0)biological replicates, respectively (Fig. 1B).We searched for statistically significant de novo DMRs in between-group pairwise comparisons for each of the three sequence contexts (CpG-, CHG, CHH-DMRs), resulting in nine pairwise comparisons.We performed a similar analysis for the individual ramets sampled at the start (March) and in the middle (June/July) of the growing season, by grouping the WGBS data based on the average January temperature at the geographic location of the Lombardy poplar parent-of-origin (i.e., ramets originated from the ortets from Hungary and UK versus ramets from the ortets located in Spain and Italy).

Gene ontology enrichment analysis
Employing the Pop_tri_v3 reference genome, we expected a lower mapping efficiency compared to the P. nigra reference genome but it enabled us to explore the biological significance of differentially methylated regions (DMRs).To test which biological processes were over-represented in sequences containing DMRs (as compared to the Populus trichocarpa genome), we performed a statistical overrepresentation test in PANTHER Classification System version 16 (http:// panth erdb.org/) [58].We performed Fisher's exact test with FDR correction using the differentially methylated gene features (genes IDs and promoters) as input data and the Populus trichocarpa GO-Slim annotation data as reference data, with a p-value cutoff of 0.05.We further explored the biological importance of the list of genes in significant DMRs with the Bioconductor package simplifyEnrichment v1.8.0 [59].This package implements the binary cut algorithm for clustering similarity matrices of functional GO terms.GO terms that were significantly overrepresented in genes containing DMRs are clustered and their similarities are visualized in a heatmap.The package simplifyEnrichment visualizes the summaries of clusters by word clouds which helps to interpret the common biological functions shared in the clusters.Additional file 9 includes the R-script with the code to reproduce the clustering and visualizing of the GO enrichment results.

Cluster analysis
We performed a clustering analysis based on the methylated cytosines from DNA samples of individual plants to investigate the methylome variability between plants and the stability of the epigenome during a growing season and across vegetatively reproduced generations.We first explored the overall differences in cytosine methylation levels (differential methylated positions, DMPs) using a hierarchical clustering analysis based on Pearson's correlation distance and the Ward method (minimum variance method) with the Bioconductor package methylKit 1.24.0.[60].We used the filtering function "filterByCoverage" as described in the vignette of methylKit, i.e., removing loci with lower than 10 × coverage and loci in 99.9th percentile of coverage in each sample.We further investigate DMPs between the samples in promotors and gene bodies using multidimensional scaling (MDS) plots based on the M-value for each of the three sequence contexts (CpG, CHG, CHH) separately with the R package edgeR.The M-value is a common measure of the methylation level and is defined as M = log2((Me + α)/(Un + α)) where Me and Un are the methylated and unmethylated intensities and α is some suitable offset to avoid taking logarithms of zero.In the MDS plot the distance between each pair of samples represents the average logit change between the samples for the top most differentially methylated loci between that pair of samples.

Common environment experiment Design
The design of the common environment experiment is described in Vanden Broeck et al. [12].In brief, during the winter of 2016-2017, dormant 1-year old twigs were collected on 94 adult putative Lombardy poplar trees (hereafter called; ortets, F 0 -generation) located at 37 different sites in Europe and Asia, including those that were used in the WGBS analysis.The twigs were shipped by express mail to the Research Institute for Nature and Forest (INBO) in Geraardsbergen, Belgium and stored in the fridge at 4 °C until the establishment of the greenhouse experiment in March 2017.Out of the 94 putative Lombardy poplar trees, 72 were identified as "true" Lombardy poplars based on the multilocus genotypes obtained with 11 nuclear microsatellite polymorphisms.The poplar trees identified as "true" Lombardy poplars include the most common genotype that was shared by most of the sampled trees (G01; see Vanden Broeck et al. [12]) and two other multilocus genotypes (G07 and G14) that differed from the most common genotype for only one out of the 22 alleles at locus WPMS05 (mismatch of one repeat).
In March 2017, up to 14 (mean 12.8, range 4-14) cuttings (hereafter called; ramets, F 1 -generation) per donor tree were planted in trays filled with standard potting soil and were grown under similar light and temperature conditions in an open greenhouse at INBO.After the first growing season, the ramets were cut back and new shoots resprouted on 2-year-old roots in spring 2018.After the second growing season, in February 2019, the ramets were transplanted from the trays to larger, single pots (volume ~ 10L) with potting soil (50% white peat / 50% black peat, 0.12% nitrogen, 0.14% phosphorous, 0.24% potassium) and the experiment was transferred 30 km north to an open container field located in Melle, Belgium (lat.50.7000556°, lon.3.792222°).Also here, the plants were grown under similar temperature and light conditions for two more growing seasons (2019 and 2020) and were regularly watered.

Bud set observations
We combined the previously published bud set observations of 2017 with new observations of three consecutive growing seasons, resulting in a dataset spanning four growing seasons (2017-2020).In total, 793 ramets from 67 adult Lombardy poplar ortets from 28 locations and 13 countries were included in the experiment.These ortets were located across a 2117-km-wide latitudinal transect from Dalkeith (Scotland) to Salermo (Italy) and across a 1275-km-wide longitudinal gradient from Loiret (France) to Nagyvenyim (Hungary).In poplar, latitude is related to the length of the growing season as demonstrated by differences in growth cessation in autumn [51].An interactive map of the location of the ortets is available at Additional file 7.
Bud set was scored from summer to autumn, on the apical bud of a 1-year-old shoot on 1-year-old (2017), 2-year-old (2018), 3-year-old (2019), and 4-year-old (2020) roots.We used a seven-stage scoring system to cover onset and duration of bud set developed for Populus nigra by Rohde et al. [51].Scores go from 3 (growing apical meristem) to 0 (fully developed bud), in 0.5 intervals.Observations were performed once a week starting in July (except for 2017 where observations started at the beginning of August) until all plants had fully developed apical buds.Differences in the ontological status of the ramets, the original nutrient condition of the ramets and/ or micro-environmental variation may be associated with the physiological plant state which may affect the process of bud set formation.Plant growth is a good indicator for the physiological plant state in pot experiments [61].We measured the length of the last year's shoot (± 1 cm) in January-February 2019 and February 2020.During each growth period, the physiological condition of the plants in the common environment experiment was visually monitored in order to define the need for watering or pest control.Fertilization, water availability, and plant pests were controlled.Temperature modifies the sensitivity to day-length signals at growth cessation and influences the duration of bud formation in poplar [47].Data on the daily minimum temperature during plant growth was obtained from a nearby weather station (Hoeilaart, lat.50,77,673°, lon.4.46833°).After inspection of the plants' physiological state for statistical outliers for the response variable (bud set scores), none of the outliers were removed from the data.In total and over the 4-year period, we observed bud set on 785 ramets (on average, range 791-784) during 46 different dates, with a mean of 11.5 (range 8-16) observation dates per year, resulting in 35,950 data points.

Data analysis
Using R version 4.1.1(R Core Team, 2021) and the package glmmTMB v1.1.5[62], we performed a linear mixed effects analysis of the relationship between bud set phenology of the ramets in the common environment and the latitude of origin of the home site of the Lombardy poplar parent-of-origin.Bud set observations over the different years were considered as independent.To account for non-independence of the bud set scores of the ramets from a single ortet, we aggregated the data for the ramets per ortet by averaging the bud set scores per ortet.The different ortets (parents-of-origin) are represented by the variable ID_TREE.We used the variables ID_TREE nested within Location as random intercepts for the model.This accounts for the correlation between ortets (ID_TREE) of a single location.To account for correlation in the time series of bud set scores (they can be considered as stationary series), we use an autoregressive function (AR (1)).An AR(1) model estimates the correlation between two observations in adjacent time steps.The following fixed effects (with interaction terms) were included into the selected model; the temperature sum of daily minimum temperature at the common environment experiment calculated from 1 July (Tsmin), the latitude of origin of the parent-of-origins (Lat, as a proxy for climate and photoperiod) and the year of observation (Year).The year of observation (Year) was included as a factor variable (four levels), the other predictor variables were continuous variables.The year 2017 was set as the reference year.The variable Tsmin was centered and standardized (called Tsminc), and the variable Lat was centered on 50 (range − 9.2 to 5.8, unit = 1 degree).For the sake of simplicity, only two-way interactions were taken into account.The significance (p-value) of fixed effects was tested by likelihood ratio tests of the full model with the effect in question against the model without the effect in question (null-model) using ANOVA.To avoid overfitting and for simplicity, we also used the Bayesian information criterion (BIC) in addition to the likelihood ratio tests, to select the optimal model.

Fig. 1 .
Fig. 1.A Lombardy poplars.B Location of the collection sites of the Lombardy poplar ramets included in the DNA pools.The number of trees included in each DNA pool is indicated on the map.The DNA pools were categorized in three groups for the analysis of differential methylated regions.The grouping was based on the mean temperature in January at the location of the Lombardy poplar ortets calculated for the period 1965-2015 (data from CRU TS Version 4.05).(green; 1.83-3.41°C;blue; − 1.40°C to − 1.18°C, purple; 4.97-8.02°C)

Fig. 4 .
Fig.4.A Genes in differentially methylated regions (DMRs) per sequence context in gene bodies and in promoters.Samples of 16 individual Lombardy poplars grown in a common environment were grouped based on their common ortet; "HUN4, " "ITS3, " "SPC1, " "UKD2" located in Hungary, Italy, Spain, and the UK, respectively.Different colors represent different pairwise comparisons between groups.B Gene ontology classification.Bar chart representing the functional categorization of the annotated DMRs (p < 0.05, category; biological process) per sequence context in gene bodies and in promoters.Information on the ortets is given in Table1

Fig. 5 .
Fig. 5.A Fitted values for bud set score plotted against the cumulative daily minimum temperature during plant growth.The ramets originated from Lombardy poplar ortets growing at different latitudes.Bud set score three; apical shoot is fully growing, bud set score zero; the apical bud is formed.Confidence intervals are indicated in grey.B Fitted values for bud set score per year plotted against the latitudes of the ortets.Tsminc the temperature sum of daily minimum temperature at the common environment experiment calculated from 1 July, centered and standardized

Table 1
Leaf samples collected for the first methylome study from individual Lombardy poplar ramets

Table 2
Effect sizes of the variables maintained in the linear mixed effects model affecting the bud set phenology Tsminc the temperature sum of daily minimum temperature at the common environment experiment calculated from 1 July, centered and standardized; Lat the latitude of origin, centered on 50, of the parent-of-origin.Significant p-values (< 0.05) are indicated in bold.p-values of main effects in the presence of an interaction are not to be interpreted but are retained in the model